A method of multi-layered speech segmentation tailored for speech synthesis

Author

  • Takashi Saito
Abstract

This paper presents a speech segmentation scheme designed for creating voice inventories for speech synthesis. Phoneme segment information alone is not sufficient for speech synthesis; multiple layers of segments, such as breath groups, accent phrases, phonemes, and pitch marks, are all necessary to reproduce the prosody and acoustics of a given speaker. The segmentation algorithm devised here can extract this multi-layered segmental information in a distinctly organized fashion, and is fairly robust to differences in speaker and speaking style. Experimental evaluations on speech corpora with a fairly large variation of speaking styles show that the algorithm is quite accurate and robust in extracting both phoneme and accentual-phrase segments.
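
As an illustration of what "multi-layered" segmental information could look like in practice, the following minimal sketch nests the four layers named in the abstract so that each segment contains segments of the next finer layer. The class name, field names, strict layer ordering, and all time values are assumptions made for this example; the abstract does not describe the paper's actual data structures or algorithm.

```python
from dataclasses import dataclass, field
from typing import List

# Layer names come from the abstract; treating them as a strict
# coarse-to-fine hierarchy is an assumption of this sketch.
LAYERS = ["breath_group", "accent_phrase", "phoneme", "pitch_mark"]


@dataclass
class Segment:
    layer: str
    start: float  # seconds
    end: float    # seconds (equal to start for a point-like pitch mark)
    label: str = ""
    children: List["Segment"] = field(default_factory=list)

    def add(self, child: "Segment") -> "Segment":
        """Attach a child from the next finer layer that lies inside this segment."""
        if LAYERS.index(child.layer) != LAYERS.index(self.layer) + 1:
            raise ValueError(f"{child.layer} cannot be nested directly in {self.layer}")
        if not (self.start <= child.start <= child.end <= self.end):
            raise ValueError("child segment must lie within its parent")
        self.children.append(child)
        return child

    def flatten(self, layer: str) -> List["Segment"]:
        """Collect all segments of the requested layer at or below this one."""
        if self.layer == layer:
            return [self]
        found: List["Segment"] = []
        for child in self.children:
            found.extend(child.flatten(layer))
        return found


# Hypothetical values, for illustration only.
bg = Segment("breath_group", 0.0, 1.2)
ap = bg.add(Segment("accent_phrase", 0.0, 0.6))
ph = ap.add(Segment("phoneme", 0.08, 0.20, label="o"))
ph.add(Segment("pitch_mark", 0.10, 0.10))
```

With such a structure, a synthesis front end could query one layer at a time, for example `bg.flatten("phoneme")` for unit selection or `bg.flatten("pitch_mark")` for pitch-synchronous processing.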


Similar resources

A VoiceFont Creation Framework for Generating Personalized Voices

This paper presents a new framework for effectively creating VoiceFonts for speech synthesis. A VoiceFont in this paper represents a voice inventory aimed at generating personalized voices. Creating well-formed voice inventories is a time-consuming and laborious task. This has become a critical issue for speech synthesis systems that make an attempt to synthesize many high quality voice personal...

Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques

One interesting topic in the multimedia domain concerns enabling computers to produce speech. Speech synthesis grants the computer the human ability of speech production. The data-based approach and the process-based approach are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...

Uniform Speech Parameterization for Multi-Form Segment Synthesis

In multi-form segment synthesis speech is constructed by sequencing speech segments of different nature: model segments, i.e. mathematical abstractions of speech and template segments, i.e. speech waveform fragments. These multi-form segments can have shared, layered or alternate speech parameterization schemes. This paper introduces an advanced uniform speech parameterization scheme for statis...

Fully automatic segmentation for prosodic speech corpora

While automatic methods for phonetic segmentation of speech can help with rapid annotation of corpora, most methods rely either on manually segmented data to initially train the process or manual post-processing. This is very time-consuming and slows down porting of speech systems to new languages. In the context of prosody corpora for text-to-speech (TTS) systems, we investigated methods for f...

A Comparative Study of Speech Segmentation and Preprocessing for Automatic Multi-lingual Recognition

Speech is the most intuitive means of communication between people from birth, except for those who are mute or deaf-mute. Hong Kong, a multicultural society, is an ideal place to develop a multilingual (Cantonese, Mandarin, and English) automatic speech recognition system. To this end, numerous techniques were explored for the three major stages of speech data processing: segmentation, preprocessing...

Publication date: 2005